Hilbert Envelope Based Features for Far-Field Speech Recognition
Abstract
Automatic speech recognition (ASR) systems trained on speech from close-talking microphones generally fail to recognize far-field speech. In this paper, we present a Hilbert envelope based feature extraction technique to alleviate the artifacts introduced by room reverberation. The proposed technique models the temporal envelopes of the speech signal in narrow sub-bands using Frequency Domain Linear Prediction (FDLP). ASR experiments on far-field speech show that the proposed FDLP features significantly improve performance over other robust feature extraction techniques (average relative improvement of 43% in word error rate).
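The core idea of FDLP, fitting an all-pole model to the cosine transform of a signal so that the model's "spectrum" traces the signal's temporal envelope, can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation: the function names (`dct_ii`, `autocorrelation`, `levinson_durbin`, `fdlp_envelope`), the naive O(N^2) DCT, the model order, and the single-click test signal are all assumptions chosen for illustration, and the sub-band decomposition and feature computation described in the abstract are omitted.

```python
import cmath
import math


def dct_ii(x):
    """Naive O(N^2) unnormalized DCT-II; adequate for a short demo signal."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
        for k in range(n)
    ]


def autocorrelation(seq, max_lag):
    """Biased autocorrelation of seq for lags 0..max_lag."""
    n = len(seq)
    return [
        sum(seq[i] * seq[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LP normal equations; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def fdlp_envelope(x, order):
    """All-pole approximation of the squared temporal envelope of x.

    Linear prediction is applied to the DCT of x (the frequency domain),
    so evaluating err / |A|^2 on [0, pi] yields a function of time.
    """
    n = len(x)
    a, err = levinson_durbin(autocorrelation(dct_ii(x), order), order)
    env = []
    for t in range(n):
        w = math.pi * (t + 0.5) / n  # time index t mapped to an angle in [0, pi]
        a_w = sum(a[i] * cmath.exp(-1j * w * i) for i in range(order + 1))
        env.append(err / abs(a_w) ** 2)
    return env


# Hypothetical test signal: a single click at sample 40 of a 128-sample frame.
x = [0.0] * 128
x[40] = 1.0
env = fdlp_envelope(x, order=2)
peak = max(range(len(env)), key=env.__getitem__)
print("envelope peak at sample", peak)
```

In this sketch the click becomes a sinusoid in the DCT domain, and the order-2 predictor places a pole pair at the corresponding angle, so the modelled envelope peaks close to the click position. Real speech would need a higher model order and a separate model per sub-band.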
Similar resources
Modulation Spectrum Analysis for Recognition of Reverberant Speech
Recognition of reverberant speech constitutes a challenging problem for typical speech recognition systems. This is mainly due to the conventional short-term analysis/compensation techniques. In this paper, we present a feature extraction technique based on modeling long segments of temporal envelopes of the speech signal in narrow sub-bands using frequency domain linear prediction (FDLP). FDLP...
Fepstrum Features: Design and Application to Conversational Speech Recognition
In this paper, we present the Fepstrum features, a principled approach to estimating the modulation spectrum of speech signals using Hilbert envelopes in a nonparametric way. The importance of the modulation spectrum as a feature in automatic speech recognition (ASR) has been established by several researchers over the past two to three decades. However, traditionally, in the speech r...
Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech
In this paper, we present a spectro-temporal feature extraction technique using sub-band Hilbert envelopes of relatively long segments of speech signal. Hilbert envelopes of the sub-bands are estimated using Frequency Domain Linear Prediction (FDLP). Spectral features are derived by integrating the sub-band Hilbert envelopes in short-term frames and the temporal features are formed by convertin...
Envelope-based inter-aural time difference localization training to improve speech-in-noise perception in the elderly
Background: Many elderly individuals complain of difficulty in understanding speech in noise despite having normal hearing thresholds. According to previous studies, auditory training leads to improvement in speech-in-noise perception, but these studies did not consider the etiology, so their results cannot be generalized. The present study aimed at investigating the effectiveness of envelope-b...
EEMD-Based Speaker Automatic Emotional Recognition in Chinese Mandarin
Emotion feature extraction is the key to speech emotion recognition. Ensemble empirical mode decomposition (EEMD) is a newly developed method aimed at eliminating the mode mixing present in the original empirical mode decomposition (EMD). To evaluate the performance of this new method, this paper investigates the effect of a parameter pertinent to EEMD: the speech emotional envelope. First...